Abstract

The minimal sets within a collection of sets are defined as the ones
which do not have a proper subset within the collection, and the maximal
sets are the ones which do not have a proper superset within the
collection. Identifying extremal sets is a fundamental problem with a
wide range of applications in SAT solvers, data mining and social network
analysis. In this paper, we present two novel improvements of the
high-quality extremal set identification algorithm, AMS-Lex, described by
Bayardo and Panda. The first technique uses memoization to improve the
execution time of the single-threaded variant of AMS-Lex, whilst our
second improvement uses parallel programming methods. In a subset of
the presented experiments our memoized algorithm executes more than
400 times faster than the highly efficient publicly available implementation
of AMS-Lex. Moreover, we show that our modified algorithm’s speedup
is not bounded above by a constant and that it increases as the length of
the common prefixes in successive input itemsets increases. We provide
experimental results using both real-world and synthetic data sets, and
show our multi-threaded variant outperforming AMS-Lex by a factor of 3
to 6. We find that on synthetic input datasets, when executed using
16 CPU cores of a 32-core machine, our multi-threaded program executes
about as fast as the state-of-the-art parallel GPU-based program using an
NVIDIA GTX 580 graphics processing unit.


1 Introduction 

1.1 Motivation 

The problem studied in this paper is that of finding the extremal sets within 
a dataset (family of sets) D. The extremal sets of D are all the sets in D that 
are maximal or minimal with respect to the partial order induced on D by the 
subset relation. 
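As a point of reference for the definition above, the following is a minimal brute-force sketch in Python. It is not one of the algorithms studied in this paper; the function name `extremal_sets` and the small example dataset are illustrative assumptions, and the quadratic pairwise comparison is only practical for tiny inputs.

```python
def extremal_sets(dataset):
    """Return (minimal, maximal) itemsets by brute-force pairwise comparison."""
    sets = [frozenset(s) for s in dataset]
    # A set is minimal if no other set in the collection is a proper subset of it.
    minimal = [s for s in sets if not any(t < s for t in sets)]
    # A set is maximal if no other set in the collection is a proper superset of it.
    maximal = [s for s in sets if not any(t > s for t in sets)]
    return minimal, maximal


# Hypothetical example dataset; each string is read as a set of single-letter items.
D = ["abc", "abde", "abdf", "bd", "c"]
minimal, maximal = extremal_sets(D)
```

Every itemset is compared against every other one, which is exactly the cost that the ordering-based algorithms discussed later try to avoid.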

Finding extremal sets is a fundamental problem and has many motivating 
applications. For example, large-scale SAT solvers use extremal set identification
as an optimization step [1]. Extremal sets are also used for performing
itemset support queries in data mining [2], and social network analysis [3], as
well as in trajectory-based query algorithms with applications in surveillance.
Early theoretical algorithms were motivated by problems in propositional
logic.

We find our inspiration for working on the problem of finding extremal sets
in the domain of searching for optimal depth sorting networks. Bundala et
al. [6] describe a method (Lemma 2 in Section 3.2) for reducing the search
space by considering only the output-minimal networks within a collection of
outputs of comparator networks of the same depth. Although Bundala et al.
present a stronger search space reduction technique (output minimal up to
permutation), the problem of finding the minimal itemsets within a dataset
is used as a preliminary reduction step. The reason is that the minimal up to
permutation problem is GI-hard [7] whereas the minimal itemset problem is known
to be sub-quadratic [5]; hence, one would use the output of the latter as an input
to the former. The algorithm described in this paper was initially developed
to find such output-minimal networks (itemsets) within a dataset and hence
our discussion and examples focus on finding the minimal itemsets. However,
as with Bayardo and Panda’s state-of-the-art practical algorithm AMS-Lex [3],
our approach can be used to compute minimal or maximal itemsets.

In this paper, we present two optimization techniques that we apply to the
AMS-Lex algorithm to achieve a faster execution time: the first one uses
memoization and the second one uses parallel programming techniques. The
memoization technique is aimed at speeding up the AMS-Lex algorithm for finding
the extremal itemsets within datasets containing a large number of common
prefixes, such as the ones found in the sorting networks domain. The presented
parallel version of AMS-Lex is aimed at utilizing more of the CPU resources
that are generally available in modern-day computers. Using experimental
evaluation we demonstrate the speedup achieved by both of them when compared
to the highly efficient implementation of the AMS-Lex algorithm described by
Bayardo and Panda [3].

Given that AMS-Lex ‘is easily modified to find minimal itemsets’ [3], without
loss of generality, in this paper we focus on finding the minimal itemsets within
an input dataset. We give a full explanation of how exactly AMS-Lex is to be
modified to find the minimal itemsets, rather than the maximal itemsets [3],
in Section 2. Furthermore, since our optimization techniques build on top of the
existing algorithm (and implementation) of AMS-Lex, the presented modification
of AMS-Lex can be easily transformed to find the maximal itemsets.

1.2 Related Work 

We denote by $N$ the sum of the cardinalities of all the sets in the input
dataset $D$, and informally refer to it as the size of the input. Although the
algorithms for computing extremal sets are almost quadratic in $N$ in the worst
case, due to the nature of datasets in applications, practical algorithms can
operate efficiently for very large $N$ [3]. In this paper we provide experimental
results for $N = 7.2 \times 10^8$.

Yellin described algorithms for maintaining a dynamic family of sets
under insertion, deletion, intersection and subset query operations. He presents
an output-sensitive algorithm for identifying extremal sets after a sequence of
$n$ operations that operates in $O(mn)$ time, where $m$ is the number of maximal
sets. Note that $n$ is the sum of $N$ and the number of sets in the dataset, and
hence $n > N$.

1 Bayardo and Panda have made their implementation of the AMS-Lex algorithm publicly
available at https://code.google.com/p/google-extremal-sets/

Early sub-quadratic time algorithms for finding extremal sets were described
by Yellin and Jutla, operating in $O(N^2/\log N)$ expected time, and by
Pritchard [5], who provided a matching worst-case time bound. Pritchard [5]
described the first algorithms that required sub-quadratic space, providing
algorithms requiring $O(N^2/\log N)$ space.

Shen and Evans also studied algorithms for maintaining a dynamic
family of sets, operating in time $O(N^2/\log^2 N)$ and requiring $O(N^2/\log^3 N)$
space. We do not study this dynamic version of the extremal set problem in
this paper.

Pritchard [8] described the first algorithm to make use of a lexicographic
ordering of the input sets. Among the practical algorithms for computing
extremal sets is the highly efficient implementation of the AMS-Lex algorithm
described by Bayardo and Panda [3]. AMS-Lex is the state-of-the-art practical
algorithm for finding extremal sets that is designed to run on commodity CPUs.
In this paper we give a detailed explanation of AMS-Lex in Section 2, as it is the
basis of our work.

Fort et al. described a highly parallel algorithm designed specifically for
graphics processing units (GPUs). Fort et al. show that their parallel
algorithm running on a GPU can outperform AMS-Lex running on a single core of
a conventional CPU. The single-threaded algorithm we describe in this
paper is targeted at running on an ordinary commodity CPU and therefore we
compare its performance to the algorithm of Bayardo and Panda [3]. In the
experimental evaluation in Section 5 we compare the execution time of our two
new algorithms to the execution time reported by Fort et al. by evaluating on
synthetically generated datasets.

1.3 Contributions 

The main contributions of this work can be summarized as:

• We present a memoized version of AMS-Lex that takes advantage of common
prefixes among itemsets.

• We outline a parallel modification of the AMS-Lex extremal sets
algorithm.

• We present experimental results over both real-world and synthetic data
for both the memoized and parallel modifications of the AMS-Lex
extremal sets algorithm. We find that the speedup of the memoized
algorithm increases as the length of the common prefixes of itemsets in the
input dataset increases, and that the speedup of the parallel algorithm
increases as the number of CPU cores used increases.

2 Background 

Practical algorithms for computing the extremal sets of a dataset $D$ assume
that the elements of $D$ are sets of items, called itemsets. Furthermore, these
algorithms assume that there is an ordering on the itemsets themselves. An
input to an extremal set algorithm is then an ordered multiset of itemsets,
referred to as a dataset $D$.

The choice of the ordering on the itemsets gives rise to alternative algorithms 
for computing extremal sets. For example, if itemsets are ordered by cardinality 
then the simple observation that if itemset a is a proper subset of itemset b 
then the cardinality of a is less than the cardinality of b can be used to prune 
the search space. This gives rise to an algorithm referred to as AMS-Card by 
Bayardo and Panda [3].
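The cardinality observation can be illustrated with a short Python sketch. This is not the AMS-Card algorithm of [3], only an assumed simplification that uses the same pruning idea; the function name `minimal_by_cardinality` is hypothetical.

```python
def minimal_by_cardinality(dataset):
    """Find the minimal itemsets after ordering the input by cardinality."""
    sets = sorted((frozenset(s) for s in dataset), key=len)
    minimal = []
    for s in sets:
        # A proper subset of s has strictly smaller cardinality, so it has
        # already been processed. If that subset is itself non-minimal, it
        # contains a minimal set which is also a proper subset of s, so it is
        # enough to compare s against the minimal sets found so far.
        if not any(m < s for m in minimal):
            minimal.append(s)
    return minimal


result = minimal_by_cardinality(["abc", "abde", "abdf", "bd", "c"])
```

The ordering guarantees that no itemset ever has to be compared against a larger one, which is the pruning that the observation above describes.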

Pritchard [5] exploited a lexicographic ordering of itemsets to obtain more
efficient algorithms for identifying extremal sets. In particular, he noted the
following:

Theorem 2.1. Let $a$ and $b$ be itemsets such that $a \subset b$. Then either $a$ is a proper
prefix of $b$ or $a$ is lexicographically larger than $b$.
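The property can be checked exhaustively on a small alphabet. The sketch below is an illustration only, under the assumption that an itemset is represented as a tuple of items in increasing order, so that Python's tuple comparison coincides with the lexicographic order; the helper names are hypothetical.

```python
from itertools import combinations


def is_proper_prefix(a, b):
    """True iff tuple a is a proper prefix of tuple b."""
    return len(a) < len(b) and b[:len(a)] == a


def check_theorem(alphabet):
    """Check the prefix-or-larger property for all pairs of itemsets over alphabet."""
    items = sorted(alphabet)
    # combinations() of a sorted list yields tuples that are themselves sorted.
    itemsets = [c for r in range(len(items) + 1) for c in combinations(items, r)]
    for a in itemsets:
        for b in itemsets:
            if set(a) < set(b):  # a is a proper subset of b
                if not (is_proper_prefix(a, b) or a > b):
                    return False
    return True


holds = check_theorem("abcde")
```

For instance, $bd \subset abde$ and $bd$ is lexicographically larger than $abde$, whereas $ab \subset abde$ is a proper prefix.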

The most efficient practical algorithm for identifying extremal
sets, AMS-Lex, described by Bayardo and Panda [3], makes use of this lexicographic
ordering and the preceding property to substantially prune the search space. In
order to present our improvements we must first describe the AMS-Lex
algorithm in detail.

2.1 The AMS-Lex algorithm 

In this section we reproduce the AMS-Lex algorithm. We re-use the notation of [3]
when referring to the input ordered dataset $D$:

• $D[i]$ denotes the $i$th itemset in $D$.

• $D[i][j]$ denotes the $j$th item of itemset $D[i]$.

• $D[i : j]$ denotes the ordered multiset of itemsets $\{D[k] \mid k = i \ldots j\}$ in that
order.

• $D[i][j : k]$ denotes the ordered multiset of items $\{D[i][l] \mid l = j \ldots k\}$.

We also re-use Bayardo and Panda’s subsumed notation: an itemset $A$ is
subsumed by $B$ iff $A$ is a subset of $B$.

The pseudo code of the AMS-Lex algorithm itself is shown in Algorithm 2.
It applies the result of Theorem 2.1 directly, first identifying the
proper prefixes that are subsumed by lexicographically smaller itemsets, and
then searching among the remaining itemsets using Contains-Subset-Of. The
function Contains-Subset-Of takes as input an itemset $S$ and dataset $D$ and
returns all $x \in D$ such that $x \subset S$ and $x$ is lexicographically larger than $S$.
Contains-Subset-Of makes use of the common prefixes of itemsets in $D$ as well
as the lexicographic order of $D$. Since the items in the itemsets themselves are
ordered lexicographically, the functions NextBeginRange, NextEndRange, and
NextItem can be implemented using binary search.

2.1.1 Example


Figure 1 presents the call graphs (as per Definition 3.1) of the AMS-Lex [3]
algorithm for finding the minimal itemsets over the dataset
$D = \{D_1 = abc, D_2 = abde, D_3 = abdf, D_4 = bd, D_5 = c\}$. Looking at the figure we can see that the
minimal itemsets are $D_4$ and $D_5$; also that $D_1 \supset D_5$, $D_2 \supset D_4$ and $D_3 \supset D_4$.
The dataset $D$ is chosen such that every line of the function Contains-Subset-Of
is executed at least once, thus handling all cases of Bayardo and Panda’s [3]
AMS-Lex algorithm.


2.1.2 Contains-Subset-Of Explanation 

The Contains-Subset-Of function exploits the common prefixes of itemsets in D 
by taking advantage of the lexicographic order of D. The function is designed to 
efficiently find all itemsets in the range D[b : e] that are subsets of S (i.e., that 
are subsumed by S). The itemsets in D are processed in ranges which share a 
common prefix of length at least d. 

The first thing we check in the function is whether the next item ($D[b][d+1]$) is
contained in $S$, by finding the first element of $S$ which is greater than or equal
to $D[b][d+1]$. If all elements of $S$ are smaller than $D[b][d+1]$ we can safely deduce
that there are no itemsets subsumed by $S$ in the range $D[b : e]$. This is because
all itemsets in $D[b : e]$ are ordered lexicographically in ascending order. Hence
if $S[|S|] < D[b][d+1]$ then $S[|S|] < D[i][d+1]$ for all $i$ in the range $[b, e]$. Otherwise,
we reach a state where we know that the element $S[j] \geq D[b][d+1]$.

If $S[j] = D[b][d+1]$ then we know that it is possible for $D[b]$ to be a subset
of $S$. Hence we have to make a recursive call to Contains-Subset-Of. In order
to do this we first have to find a new end range $e'$ such that all elements in
$D[b : e']$ have a common prefix of length at least $d+1$. Then we check if there are
any subsumed itemsets. Next we check if the requirements of the recursive call
to Contains-Subset-Of that we want to make are met. If this is the case then we
mark the itemsets subsumed by $S$ in the range $D[b : e]$. Since we have already covered
the range $D[b : e']$ we set the current start of our range $b$ to $e'$.

If $S[j] > D[b][d+1]$ then we know that $D[b]$ cannot be subsumed by $S$.
Hence we search for the first element in $D[b : e]$ which has a value at index
$d+1$ greater than or equal to $S[j]$; this operation is referred to as the subroutine
NextBeginRange.

Lastly we check if the current begin range is smaller than the current end
range and, if this is the case, we mark all subsumed sets of $S$ in the range $D[b : e]$
by making a recursive call to Contains-Subset-Of.
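The search described above can be sketched in Python as follows. This is an assumed simplification rather than a transcription of Algorithm 1: it uses 0-based half-open ranges, loops where the pseudo code recurses on the remaining range, returns a boolean instead of marking itemsets, and assumes non-empty, de-duplicated itemsets. The names `first_index`, `contains_subset_of` and `minimal_itemsets` are hypothetical.

```python
from bisect import bisect_left


def first_index(D, lo, hi, d, pred):
    """Smallest i in [lo, hi) with pred(D[i][d]) true, or hi if there is none.

    Assumes pred is monotone (False ... False True ... True) over the range.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(D[mid][d]):
            hi = mid
        else:
            lo = mid + 1
    return lo


def contains_subset_of(D, b, e, S, j, d):
    """True iff some itemset in D[b:e] is a subset of S.

    Precondition: the itemsets in D[b:e] are longer than d and share a prefix
    of length d whose items all occur in S before position j.
    """
    while b < e:
        item = D[b][d]
        j = bisect_left(S, item, j)  # plays the role of NextItem
        if j == len(S):
            return False  # every remaining item of S is smaller than item
        if S[j] == item:
            # End of the sub-range sharing a prefix of length d + 1 (NextEndRange).
            e2 = first_index(D, b + 1, e, d, lambda x: x > item)
            if len(D[b]) == d + 1:
                return True  # D[b] has been matched completely
            if contains_subset_of(D, b, e2, S, j + 1, d + 1):
                return True
            b = e2
        else:
            # S[j] > item: skip itemsets whose item at depth d is not in S
            # (NextBeginRange).
            target = S[j]
            b = first_index(D, b + 1, e, d, lambda x: x >= target)
    return False


def minimal_itemsets(itemsets):
    """Minimal itemsets of a collection, returned in lexicographic order."""
    D = sorted({tuple(sorted(s)) for s in itemsets})
    present = set(D)
    result = []
    for i, S in enumerate(D):
        # Proper subsets are either proper prefixes or lexicographically larger.
        has_prefix = any(S[:k] in present for k in range(1, len(S)))
        if not has_prefix and not contains_subset_of(D, i + 1, len(D), S, 0, 0):
            result.append(S)
    return result


found = minimal_itemsets(["abc", "abde", "abdf", "bd", "c"])
```

On the example dataset of Section 2.1.1 the sketch reports $bd$ and $c$ as the minimal itemsets. The three binary searches correspond to the subroutines NextItem, NextEndRange and NextBeginRange.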


3 A Memoized Algorithm for Identifying Extremal Sets

The AMS-Lex [3] algorithm uses a frequency-based item ordering to reduce the
probability of itemsets sharing long common prefixes. Nonetheless, AMS-Lex
takes advantage of common prefixes shared between consecutive itemsets. More
precisely, the function MarkSubsumed [3] and its variant
presented here, Contains-Subset-Of (Algorithm 1), both have the
arguments $D[b : e]$, $S$, $j$, $d$ with the restriction that all itemsets $I \in D[b : e]$ must
share a common prefix of size $d$. Hence, even after the item-based frequency
ordering of the input dataset, common prefixes are expected to occur; otherwise,
this logic would not have been included in AMS-Lex by Bayardo and Panda.
Therefore, the current state-of-the-art practical algorithm AMS-Lex already
takes advantage of the common prefixes between itemsets.

The observations and memoization technique that we present in this section
are all based on the common prefixes shared by itemsets. We take this a step
further than Bayardo and Panda by analysing the behaviour of successive calls
to the function Contains-Subset-Of (MarkSubsumed) by two itemsets $S'$ and
$S''$ which share a non-empty common prefix, whereas the current approach [3]
focuses on the efficient implementation of the function Contains-Subset-Of.

3.1 Observations 

Our improved algorithm for extremal set identification memoizes successive calls
to the function Contains-Subset-Of, defined in Algorithm 1. As we explain
below, Bayardo and Panda’s algorithm AMS-Lex, presented in Algorithm 2,
duplicates work in successive calls to Contains-Subset-Of where itemsets share a
non-empty common prefix. We now show the duplicated work more precisely,
in terms of the call graphs resulting from successive calls to Contains-Subset-Of.

Definition 3.1. The directed call graph of an itemset $S$ and the function
Contains-Subset-Of($D[b : e], S, j, d$) is defined as a graph $G(S) = (V, E)$, where
$V = \{(b, e, S, j, d) \mid b, e, S, j, d \text{ meet the input requirements of Contains-Subset-Of}\}$,
and $(v_1, v_2) \in E$ iff Contains-Subset-Of($v_1.b, v_1.e, v_1.S, v_1.j, v_1.d$)
makes a recursive call to Contains-Subset-Of($v_2.b, v_2.e, v_2.S, v_2.j, v_2.d$).

Remark 3.2. Note that since the Contains-Subset-Of function in Algorithm 1
performs at most two recursive calls, the out-degree of any vertex in a
call graph $G(S)$ is at most two.

Notation 3.3. For a call graph $G(S) = (V, E)$ and any $v = (b, e, S, j, d) \in V$,
we refer to the values of $v$ as $v.b$, $v.e$, $v.j$ and $v.d$; and we refer to the children
of $v$ as $v.c_1$ and $v.c_2$. We denote by $v.t$ a boolean field which is true iff there
exists a subset of $S$ in the range $[v.b, v.e]$ that is of size $v.d + 1$. We denote by $v.m$
the maximum index that is accessed from the itemset $S$ without considering
any recursive calls of Contains-Subset-Of.

Remark 3.4. Note that at any single call graph node corresponding to a call
to the function Contains-Subset-Of($D[b : e], S, j, d$) the only indices of $S$ that are
required are those between $j$ and NextItem($S, j, D[b][d+1]$). Hence, we can see
that $v.m$ is bounded above by NextItem($S, j, D[b][d+1]$).

Lemma 3.5. Let $S$ and $T$ be itemsets with a common prefix $P$. Let
$G(S) = (V_S, E_S)$ and $G(T) = (V_T, E_T)$. Suppose that $v_1, v_2 \in V_S$, where $v_1 = (b, e, S, j, d)$
and $v_2 = (b', e', S, j', d')$ such that $j' < |P|$, and that $(v_1, v_2) \in E_S$. Then
$(w_1, w_2) \in E_T$ where $w_1 = (b, e, T, j, d)$ and $w_2 = (b', e', T, j', d')$.

Proof. Referring to Algorithm 1, note that because $S$ and $T$ have a common
prefix $P$ of length greater than $j'$, all requirements of Contains-Subset-Of are
met for the inputs represented by $w_1$ and $w_2$. Hence we have $w_1, w_2 \in V_T$. We
now need to show that there is an edge between $w_1$ and $w_2$. Since $(v_1, v_2) \in E_S$
and, from Remark 3.4, the only values required of $S$ by Contains-Subset-Of are
in the range $[j, j']$, and as a result of the further assumption that $j' < |P|$, it
follows immediately that $(w_1, w_2) \in E_T$. □


Remark 3.6. Note that for any itemset $S$, the call graph $G(S) = (V, E)$ is
acyclic because in all recursive calls to Contains-Subset-Of the range $[b, e]$ gets
smaller, $S$ is always constant, $j$ increases and $d$ increases.


Notation 3.7. For any itemset $S$, we refer to the subgraph of $G(S) = (V, E)$
identified by $V' = \{(b, e, S, j, d) \in V \mid j < |P|\}$ as $G(S)|_{j < |P|}$.


Corollary 3.8. Let $S$ and $T$ be itemsets with a common prefix $P$. Then

$G(S)|_{j < |P|} = G(T)|_{j < |P|}$.


Proof. Use induction to apply Lemma 3.5 multiple times starting from the root 
of G(S) identified by the vertex (b, e, S, j = 1, d = 0). □ 


3.2 Algorithm 

The pseudo code of our modified algorithm for identifying minimal sets is
presented in Algorithm 4 and we now give an informal description of its behaviour.
For each call made to Contains-Subset-Of($D[i+1 : n], D[i], 1, 0$) we memoize the
call graph $G(D[i])$ of the execution path. When we get to the point where we
need to find whether there is an itemset subsumed by $D[i+1]$, we first identify the
common prefix $P$ of $D[i]$ and $D[i+1]$. Then we traverse $G(D[i])$ using depth-first
search. For each vertex $v$ we check if a recursive call is made to
Contains-Subset-Of with some $j > |P|$. If this is the case then we execute the function
Contains-Subset-Of with input $v$; otherwise we recursively traverse the children
of $v$. This is a direct result of Corollary 3.8. In practice we note that we
need not memoize the full call graph $G(D[i])$, as we are only ever going to use
nodes $w \in G(D[i])$ for which $w.j < |P|$.
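The restriction of a memoized call graph to the reusable nodes can be sketched as follows. This is only an illustration of the idea, not Algorithm 4: the `Node` record, the helper names and the example node values are assumptions, and $j$ is taken to be 1-based as in the root vertex $(b, e, S, j = 1, d = 0)$.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One memoized call (b, e, j, d) together with its recursive calls."""
    b: int
    e: int
    j: int
    d: int
    children: List["Node"] = field(default_factory=list)


def common_prefix_length(s, t):
    """Length of the longest common prefix of two ordered itemsets."""
    n = 0
    for x, y in zip(s, t):
        if x != y:
            break
        n += 1
    return n


def restrict(node: Node, prefix_len: int) -> Optional[Node]:
    """Copy of the call graph below node, keeping only vertices with j < prefix_len.

    Because j increases along every edge (Remark 3.6), dropping a vertex also
    drops everything below it.
    """
    if node.j >= prefix_len:
        return None
    kept = Node(node.b, node.e, node.j, node.d)
    for child in node.children:
        sub = restrict(child, prefix_len)
        if sub is not None:
            kept.children.append(sub)
    return kept


# Hypothetical call graph recorded while processing the itemset abc.
root = Node(2, 5, 1, 0, [Node(2, 3, 2, 1, [Node(2, 3, 3, 2)])])
p = common_prefix_length("abc", "abde")
reusable = restrict(root, p)
```

Only the kept part needs to be stored between successive itemsets; the calls that were cut off have to be executed again for the next itemset.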

Remark 3.9. It is important to note that we use a modified version of the
function Contains-Subset-Of by assuming that it returns a pair consisting of a boolean
result, as per the specification from Algorithm 1, and the call graph representing
its execution path. We use this in the pseudo code of
the memoized version of AMS-Lex presented in Algorithm 4.


3.2.1 Example 

The sample dataset that the memoized algorithm is evaluated on in Figure 2
is the same as the dataset that AMS-Lex is evaluated on in Figure 1. The call
graphs in Figure 2 show exactly which parts of the call graphs are
memoized (the shaded nodes) by keeping track of the memoized call graph
(variable $v$ in Algorithm 4).

We see that in general, the memoized call graph of an itemset $D_i$ could be
used when processing itemset $D_{i+x}$ for any integer $x > 0$. In our presented
example in Figure 2 we see that we use part of the memoized call graph from
$D_1$ when processing $D_2$ and $D_3$; this happens because $D_1$, $D_2$ and $D_3$ share the
non-empty common prefix $ab$.

In Figure 2 (c) we see exactly that a subgraph of $v$ gets reset to null
(line 6 in Algorithm 3). That is, when $D_3$ is processed, the memoized nodes $n_3$
and $n_4$ from Figure 2 (b) are removed in the (new) memoized call graph from
Figure 2 (c).

3.3 Complexity Analysis 

Worst Case Time Complexity It is easy to see that in the worst case
(when no two itemsets have a common prefix), the complexity of our algorithm
is equal to that of AMS-Lex, that is $O(N^2/\log N)$, where $N$ is the sum of the
cardinalities of all itemsets in the input dataset.

Runtime Comparison to AMS-Lex Our algorithm’s run time is clearly
bounded above by the time required by AMS-Lex. Moreover, as the number
of common prefixes among the itemsets increases, our algorithm becomes
comparatively faster. Essentially, by executing Contains-Subset-Of
fewer times, we save run time consumed by the low-level searching routines
NextItem, NextEndRange, and NextBeginRange, which are the bottleneck
of the AMS-Lex algorithm as per [3].

Space Complexity In addition to the memory required by AMS-Lex,
Algorithm 2 stores (part of) the call graph of Contains-Subset-Of. Clearly the size
of the call graph is bounded above by the size of the input, denoted as $N$. Since
only the required portion of the call graph, as defined by Corollary 3.8, is stored
in practice, the extra space required is commonly much less than the size of the
input.

3.4 Implementation Details 

We implemented our algorithm as a modification to the publicly available
implementation (https://code.google.com/p/google-extremal-sets/) of the AMS-Lex
algorithm, only introducing the memoization
described in Algorithm 4. We regard this as valuable since it allows us to
directly measure the improvement in performance resulting from memoization.


4 A Parallel Algorithm for Identifying Extremal Sets

We use the complexity analysis of AMS-Lex [3] to identify the
bottleneck of the existing algorithm. In the worst case, finding all proper-prefix
subsumed itemsets takes $O(N)$ computational steps and finding the
remaining non-minimal itemsets takes $O(N^2/\log N) > O(N)$, where $N$ is the size
of the input. Consequently, the novel work presented in this section is a
parallel algorithm that finds the non-proper-prefix subsumed itemsets of $D$, i.e.
we present a parallel implementation of the function Get-Minimal-Itemsets-Lex
from Algorithm 2.





4.1 Observation 


The first observation we make is that the function
Contains-Subset-Of, presented in Algorithm 1 as a reproduction of Contains-Subset-Of
[3], does not modify the input dataset $D$. Each itemset can therefore be
checked independently of the others, which makes the task
of finding all minimal itemsets within $D$ embarrassingly parallel.

4.2 Algorithm 

The pseudo code for our parallel algorithm for finding the minimal itemsets
within a lexicographically ordered dataset is presented in Algorithm 5.

Entry Point We first mark every itemset within the dataset D as minimal. 
Next, we mark all itemsets as not minimal for which there exists a proper prefix 
subsumed itemset within the dataset. We then start P parallel instances of the 
thread functor whose job is to mark itemsets as non-minimal for which there 
exists a non-prefix (lexicographically larger) subsumed itemset. 

Thread Functor All of the parallel instances of the Thread-Functor function
share a common integer variable index which points to the next unprocessed
itemset $D[index] \in D$ within the dataset, starting at 1. To process the itemset
$D[index]$ means to check if there exists a non-prefix itemset within $D$ that is
subsumed by $D[index]$. We begin by atomically assigning the current value of index to the
variable $i$ and incrementing index, ensuring that every itemset in $D$ will be
processed exactly once by some Thread-Functor. We then use the function
Contains-Subset-Of from Algorithm 1 to check if a subset of $D[i]$ is found.
Finally, we try to take a new unprocessed itemset from $D$ and process it in the
same manner.
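The work distribution described above can be sketched with Python threads. This is an assumed illustration, not Algorithm 5: a lock stands in for the atomic fetch-and-increment, a brute-force predicate stands in for Contains-Subset-Of, the prefix pass is folded into that predicate, and indices are 0-based. In CPython the global interpreter lock prevents a real speedup for this kind of work, so the sketch only shows how itemsets are handed out.

```python
import threading


def mark_minimal_parallel(D, has_proper_subset, num_threads=4):
    """Mark each itemset of D as minimal or not, using a shared work index."""
    n = len(D)
    is_minimal = [True] * n
    next_index = 0
    lock = threading.Lock()

    def thread_functor():
        nonlocal next_index
        while True:
            with lock:  # atomic read-and-increment of the shared index
                i = next_index
                next_index += 1
            if i >= n:
                return
            # D is only read here, and slot i is written by exactly one thread.
            if has_proper_subset(D, i):
                is_minimal[i] = False

    threads = [threading.Thread(target=thread_functor) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return is_minimal


def brute_force_has_proper_subset(D, i):
    """Hypothetical stand-in for the subset search on itemset D[i]."""
    return any(D[k] < D[i] for k in range(len(D)))


D = [frozenset(s) for s in ["abc", "abde", "abdf", "bd", "c"]]
flags = mark_minimal_parallel(D, brute_force_has_proper_subset, num_threads=3)
```

Because the threads pull the next index on demand, itemsets that are expensive to check do not leave other threads idle.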

4.3 Complexity 

Here we give the worst case time and space complexity of the functions presented
in Algorithm 5. From Bayardo and Panda’s [3] complexity analysis of AMS-Lex
we know that the worst case time complexity of AMS-Lex is $O(N)$ to
identify the prefix subsumed itemsets and an additional $O(N^2/\log N)$ to find the
non-prefix subsumed ones; recall that $N$ denotes the sum of the cardinalities of
all the sets in the input dataset $D$. Since in this section we showed that the
function Contains-Subset-Of requires only read-only access to the dataset $D$ and
we have $P$ threads at our disposal, we deduce that the worst case execution time
of the function Get-Minimal-Itemsets-Lex-Parallel is
$O(N) + O(N^2/(P \log N)) = O(N + N^2/(P \log N))$; note that $1 \leq P \leq n$. As for the space
complexity of the Get-Minimal-Itemsets-Lex-Parallel algorithm, it is equal to
that of Get-Minimal-Itemsets-Lex [3], which is proportional to the size of the
input, i.e. $O(N)$.

5 Experiments 

Here we describe the experimental comparison of our algorithm with Bayardo
and Panda’s algorithm AMS-Lex for identifying the minimal itemsets within a
dataset. We measure execution time speedup as the ratio of the AMS-Lex
execution time to our algorithm’s execution time. Hence, a speedup
of 2 means that our algorithm executed in half the time, and a value of 1
means that both algorithms have the same execution time. For every input,
we also measure the total number of calls that each algorithm made to the
subroutines NextBeginRange and NextEndRange, because, as described in
[3], these subroutines are the bottleneck of the AMS-Lex algorithm. In our
experimental evaluation we link the decrease in the number
of range searches performed by our algorithm, in comparison to AMS-Lex, to
the execution time speedup relative to AMS-Lex.
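The speedup measure can be written down as a one-line computation. The sketch below assumes that the run times of each algorithm are averaged before the ratio is taken; the function name `speedup` and the timing values are hypothetical.

```python
from statistics import mean


def speedup(baseline_times, candidate_times):
    """Ratio of the mean baseline run time to the mean candidate run time."""
    return mean(baseline_times) / mean(candidate_times)


# Hypothetical timings in seconds for three runs of each algorithm.
factor = speedup([10.0, 10.0, 10.0], [5.0, 5.0, 5.0])
```

With these made-up timings the candidate needs half the time of the baseline, which corresponds to a speedup of 2.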

Although not presented below, we also conducted experiments with
Bayardo and Panda’s AMS-Card algorithm on all of the data and it performed
slower in all cases compared to the AMS-Lex algorithm. That is expected since,
as stated by Bayardo and Panda [3], the cardinality approach is faster than the
lexicographic one primarily for very obscure and rare cardinality
distributions. Furthermore, the goal of this paper is to present methods of
finding extremal sets that are faster than AMS-Lex and that are based on
Pritchard’s lexicographic subsumption property from Theorem 2.1.


5.1 Experimental Setup 

For all of our experiments we used a machine with four Intel Xeon CPU E7- 
4820, each with eight cores, clocked at 2.00 GHz, a third level cache size of 
18 MB and 128 GB of main memory. Note that our experiments investigate the 
case when the entire input fits in main memory. We used uniform random data 
as well as publicly available data as input to evaluate our two new algorithms 
and AMS-Lex. All of the results presented below are averaged over 3 different 
runs. 


5.2 Real-World Data 

A summary of the conducted experiments using real-world input datasets is
presented in Figure 3. We have evaluated the AMS-Lex algorithm, our memoized
approach and the parallel method using different degrees of parallelism over the
real-world datasets:

• PubMed dataset represents significant terms in the PubMed abstracts. It
consists of 8 million itemsets stored in a 2 GB file.

• DBLP dataset consists of 1 million itemsets and is used in the area of
similarity joins. The file size is 50 MB.

• SN_9_4 dataset consists of 2 million itemsets with an average size of 30.3
and an alphabet size of $2^9$. This data is derived from the domain of
9-input sorting networks by generating all non maximal networks of depth
4. The file size is 252 MB.

• SN_9_5 dataset consists of 7.5 million itemsets with an average size of
18.1 and an alphabet size of $2^9$. This data is derived from the domain of
9-input sorting networks by generating all non maximal networks of depth
5 by using the minimal ones of depth 4. The file size is 578 MB.

Sorting Networks Datasets Here we explain how the datasets
SN_9_4 and SN_9_5 were generated. We refer to the work of Bundala et al. [6]
(Lemma 2 in Section 3.2) about searching for sorting networks of optimal depth.
They describe a method of reducing the search space by considering
‘output-minimal networks’, i.e. given a dataset their algorithm needs to identify and
consider only the minimal representative itemsets within this dataset. The
input dataset SN_9_4 is generated by applying all maximal network levels to
the minimal outputs (itemsets) of networks of depth three; similarly, the dataset
SN_9_5 is generated by taking the minimal networks of depth four and applying
all maximal network levels.

The algorithm described in this paper was originally designed to find such
output-minimal networks and hence it is aimed at finding the minimal itemsets
within a dataset and not the maximal ones as per Bayardo and Panda’s approach
[3]. In Section 2 we describe in detail Bayardo and
Panda’s AMS-Lex algorithm in terms of finding the minimal itemsets. Bayardo
and Panda note that AMS-Lex can be used for finding the minimal and maximal
itemsets and that the changes needed to do one or the other are trivial. We chose
to work in terms of finding the minimal itemsets within a dataset because our
algorithm (and source code) was initially built for tackling the sorting networks
related datasets.

5.2.1 Memoized vs AMS-Lex 

Figure 3 shows a comparison of the original AMS-Lex and our two modified
versions for real-world datasets. For the DBLP and PubMed datasets the
memoized approach is marginally faster than the AMS-Lex algorithm because
there are very few itemset pairs that share a common prefix. On the other
hand, for the SN_9_4 dataset the memoized algorithm is 4.06 times faster than
AMS-Lex, and 2.96 times faster for the SN_9_5 dataset. The sorting network
input datasets tend to share long common prefixes as the size of the alphabet
is very small compared to the size of the input, which favours our memoization
technique over AMS-Lex. It is important to note that in the sorting network
datasets there are no trivially subsumed itemsets.

5.2.2 Parallel vs AMS-Lex 

Note that our parallel algorithm is executed on a machine with 32 physical
cores and all real-world experimental results are presented in Figure 3. For the
DBLP dataset we see that the speedup of the parallel algorithm over AMS-Lex
is about 3.5 for degrees of parallelism $P = 4$, 8 and 16, whereas for $P = 32$ we
see a reduced speedup. For the PubMed dataset we see substantial speedup for
all of the parallelism factors, with $P = 16$ executing 5.6 times faster than
AMS-Lex. Substantial execution time speedups are evident in the SN_9_4 and SN_9_5
datasets, both of them peaking at $P = 16$ with maximum speedup factors of 5.3
and 5.9 respectively. We elaborate more on the explanation of the performance
differences between the parallel algorithm and AMS-Lex in Section 5.3.3. It
is important to note that these real-world data execution time speedups are
comparable to or better than the ones that the approach of [12] achieves
over the AMS-Lex algorithm. Hence, we conclude that our parallel version of
AMS-Lex is faster than the original AMS-Lex on real-world data and competitive
with the implementation in [12].

5.3 Synthetic Data 

5.3.1 Input Dataset Generation 

We now describe the process of generating random input data using a random 
data generator program g(n, d, fmin). The input to the generator is the number 
of itemsets n, the number of distinct items d in the alphabet and the minimal 
item frequency fmin. For each of the d items we choose a frequency f from the 
range [fmin, 1], which indicates the fraction of itemsets that contain this 
item, and we insert the item into a set of ⌊f × n⌋ randomly chosen itemsets. 
Then we use Bayardo and Panda’s open source implementation to sort the input 
data into the format required by the algorithms. Note that the higher the 
value of the minimal frequency fmin, the greater the probability that two 
itemsets will share a common prefix. We use the value of fmin to evaluate our 
hypothesis that our algorithm is faster than AMS-Lex on inputs consisting of 
itemsets sharing long common prefixes. 
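
The generation procedure can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the function name `generate_dataset`,
the `seed` parameter, the use of integers 0..d-1 as items, the dropping of empty
itemsets and the plain lexicographic sort (standing in for the sorting done by
Bayardo and Panda's implementation) are all choices of this sketch, not details
taken from the experiments.

```python
import random


def generate_dataset(n, d, f_min, seed=0):
    """Return up to n itemsets over the items 0..d-1 as sorted tuples."""
    rng = random.Random(seed)
    itemsets = [[] for _ in range(n)]
    for item in range(d):
        f = rng.uniform(f_min, 1.0)   # frequency of this item in [f_min, 1]
        k = int(f * n)                # floor(f * n) itemsets receive the item
        for idx in rng.sample(range(n), k):
            itemsets[idx].append(item)
    # Items are appended in increasing order, so every itemset is sorted.
    # Empty itemsets are dropped (an assumption of this sketch), and the
    # collection is sorted lexicographically.
    return sorted(tuple(s) for s in itemsets if s)


dataset = generate_dataset(n=1000, d=40, f_min=0.9)
print(len(dataset), dataset[0][:5])
```

With a large value of `f_min` almost every itemset contains the smallest items,
so neighbouring itemsets in the sorted output tend to start with the same items.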

5.3.2 Memoized vs AMS-Lex 

Figure [4] shows the execution time speedup factor of our memoized algorithm 
over AMS-Lex for datasets consisting of n = 100 000, n = 500 000 and 
n = 1 000 000 itemsets with alphabet sizes of 40, 60, 80, 100, 120 and 140. We 
notice that as the minimal item frequency increases, the speedup factor 
increases drastically. The maximum execution time speedup factor of 406 is 
achieved on a dataset consisting of n = 1 000 000 itemsets with an alphabet 
size of d = 140 and a minimal frequency of fmin = 0.95. We also note that the 
execution time speedup of our algorithm is closely correlated with the factor 
of reduction in range search calls. This correlation is expected because 
these low-level subroutines are described as the bottleneck of AMS-Lex [3]. 

In Section [3] we showed that the more common prefixes the itemsets share, 
i.e. as fmin increases while n and d are kept fixed, the bigger the expected 
speedup factor, which is experimentally verified by this figure. Fixing the 
size of the alphabet d and the minimal item frequency fmin in Figure [4], we 
see that as the number of itemsets n increases, the execution time speedup of 
the memoized algorithm over AMS-Lex increases. Also, if we fix n and fmin we 
see that as d increases the execution time speedup is non-decreasing in all 
of the conducted experiments. 

Another interesting summary of our experiments is shown in Figure [5], which 
gives the execution time speedup with respect to the cardinality of the 
resulting set of minimal itemsets in three graphs, for n = 100 000, 
n = 500 000 and n = 1 000 000. Our first observation is that all of the graphs 
look very similar to each other apart from the scale of the execution time 
speedup axis. Our second observation is that the largest speedups are almost 
always achieved at the smallest resulting minimal set counts for every d and 
n. Moreover, as d increases the absolute maximum speedup increases as well, 
and all speedups tend to 0 when the size of the result is close to the size 
of the input (0.9 to 1.0). Reading the graphs in Figures [4] and [5], we 
deduce that there is a correlation between the minimal item frequency fmin 
and the resulting minimal set count: as fmin increases the number of minimal 
sets decreases. Hence, in Figure [5] we observe that as the number of minimal 
sets increases the speedup decreases, and in Figure [4] we see that as fmin 
increases the speedup increases. 

5.3.3 Parallel vs AMS-Lex 

We have summarised the conducted experiments in Figure [6], which presents the 
execution time speedup of the parallel algorithm over AMS-Lex using degrees 
of parallelism P = 4, 8, 16 and 32 on a machine with 32 physical cores. As 
input to the algorithm we used datasets of n = 1 000 000 itemsets with 
alphabet sizes of 40, 60, 80, 100, 120 and 140; these are the same 
one-million-itemset datasets that were used for experimentally comparing the 
memoized approach with AMS-Lex. From the figure, we see that as d increases 
with n and fmin kept fixed, the execution time speedup increases, but it 
tends to reach a maximum, unlike in the analogous comparison of the memoized 
algorithm with AMS-Lex. We note a very small difference between the speedups 
with P = 8 and P = 16, both of which are slightly larger than the speedups 
achieved using 4 threads. 

It is important to note that in the case of P = 32 we see a significant drop 
in the speedup over AMS-Lex in comparison to P = 4, 8 and 16. This is also 
the only case we encountered in which any of our algorithms is slower than 
AMS-Lex, even if only by a very small amount (speedup smaller than 1 on the 
graphs). This is explained by the fact that the AMS-Lex algorithm and all of 
its variations presented here are not computationally intensive but rather 
bound by memory read accesses. When P equals the number of physical cores, we 
found more L3 cache misses than with smaller parallelism factors P; there is 
also competition for the memory bus, and as P increases we inevitably hit the 
limit of the bus. The cache locality and the memory intensity of the 
application also explain the observed maximum speedups of around 4, because 
the machine we used consists of 4 physical CPU chips, each with its own L3 
cache. 

5.3.4 Comparison to Fort et al. GPU Approach 

The algorithm of Fort et al. for finding extremal sets on a GPU is compared to 
the AMS-Lex algorithm in [12]. By carefully analysing the experimental 
comparison of their algorithm with AMS-Lex, we see that when we exclude the 
time AMS-Lex needs to preprocess and sort the input dataset into the required 
format, the algorithm of Fort et al. is between 4 and 5 times faster than 
AMS-Lex when evaluated on synthetic data. Moreover, the execution time 
speedup of the Fort et al. algorithm over AMS-Lex seems to be constant. As 
presented in Figure [6], our parallel algorithm is between 3 and 4.5 times 
faster than AMS-Lex when executed with P = 16 on a 32-core machine, which is 
similar to the speedup of the Fort et al. algorithm over AMS-Lex. On the 
other hand, the speedup of our memoized approach over AMS-Lex is not bounded 
above by a constant, as demonstrated. The execution time speedup of our 
memoized method over AMS-Lex for datasets with 1 000 000 itemsets is as high 
as 400, which is much larger than any speedup over AMS-Lex reported by Fort 
et al. [12]. 

6 Conclusion 


This paper has presented two improved algorithms for identifying extremal sets 
within a dataset. We have experimentally demonstrated that both techniques 
improve the performance of the AMS-Lex algorithm on both real world and 
synthetic datasets. Our first improved algorithm uses memoization to remove 
redundant work from AMS-Lex [3], requiring at most twice the memory of 
AMS-Lex. In a subset of the conducted experiments the memoized algorithm 
executes more than 400 times faster than AMS-Lex. We show, in theory and in 
practice, that the efficiency of this improved algorithm increases as the 
length of the common prefixes shared by itemsets increases; hence the speedup 
over AMS-Lex is not bounded above by a constant, which is also evident in the 
experiments provided. The second improved algorithm uses parallelism to speed 
up the AMS-Lex algorithm. In the conducted experiments we show that our 
parallel approach outperforms Bayardo and Panda’s implementation of AMS-Lex 
on both real-world and synthetic datasets. Our parallel approach is competitive 
with Fort et al.’s approach running on a highly parallel GPU. 

ALGORITHM 1: Pseudo code for finding if the input dataset D = {D_1, D_2, ..., D_n} 
contains a proper subset of S. A reproduction of the function MarkSubsumed, 
described by Bayardo and Panda, but used for finding the minimal itemsets rather 
than the maximal ones, i.e. for finding the minimal itemsets within the dataset 
D we do not mark the subsumed itemsets but rather return true if an itemset 
properly subsumed by S exists within D and false otherwise. 

Function Contains-Subset-Of(D[b : e], S, j, d) 

Input: The ordered multiset of itemsets D[b : e], an itemset S and two integers 
    j and d where b ≤ e, 1 ≤ j ≤ |S| and 0 ≤ d < |D[b]|. The parameter j 
    specifies that we need only consider S[j : |S|], and d is the size of the 
    common prefix shared by all T ∈ D[b : e] and S. 

Output: Returns true iff there exists a proper subset of S within D[b : e], and 
    false otherwise. 

    if S[j] < D[b][d + 1] then 
        j ← NextItem(S, j, D[b][d + 1]); 
        if j is null then 
            return false; 
        end 
    end 
    if S[j] = D[b][d + 1] then 
        e' ← NextEndRange(D[b : e], S[j], d + 1); 
        if |S| > d + 1 and |D[b]| = d + 1 then 
            /* D[b] is a proper subset of S. */ 
            return true; 
        end 
        if j + 1 ≤ |S| then 
            if Contains-Subset-Of(D[b : e'], S, j + 1, d + 1) then 
                return true; 
            end 
        end 
        b ← e' + 1; 
    else 
        b ← NextBeginRange(D[b : e], S[j], d); 
        /* When there is no element in D[b : e] that has a value greater than 
           or equal to S[j] at index d + 1, the function 
           NextBeginRange(D[b : e], S[j], d) returns e + 1; i.e. we can safely 
           deduce that there is no subset of S within the collection 
           D[b : e]. */ 
    end 
    if b ≤ e then 
        return Contains-Subset-Of(D[b : e], S, j, d); 
    end 
    return false; 
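
A runnable Python sketch of Contains-Subset-Of may help when reading the
listing above. It rests on several assumptions that are not taken from the
paper: indices are 0-based, itemsets are non-empty sorted tuples, the helper
routines NextItem, NextEndRange and NextBeginRange are replaced by simple linear
scans (inlined and marked in comments), and the final tail-recursive call is
written as a `while` loop.

```python
def contains_subset_of(D, b, e, S, j=0, d=0):
    """Return True iff some D[k] with b <= k <= e is a proper subset of S.

    Assumes D is a lexicographically sorted list of non-empty sorted tuples
    and that all itemsets in D[b..e] share their first d items, each of which
    has already been matched to an item of S before position j.
    """
    while b <= e:
        x = D[b][d]  # item at position d of the first itemset in the range
        # NextItem: advance j to the first item of S that is >= x.
        while j < len(S) and S[j] < x:
            j += 1
        if j == len(S):
            return False  # no remaining item of S can match x or anything later
        if S[j] == x:
            # NextEndRange: last index whose item at position d equals x.
            e2 = b
            while e2 < e and D[e2 + 1][d] == x:
                e2 += 1
            if len(D[b]) == d + 1 and len(S) > d + 1:
                return True  # D[b] is fully matched and shorter than S
            if j + 1 < len(S) and contains_subset_of(D, b, e2, S, j + 1, d + 1):
                return True
            b = e2 + 1
        else:
            # NextBeginRange: first index whose item at position d is >= S[j].
            while b <= e and D[b][d] < S[j]:
                b += 1
    return False


# The dataset of Figure 1: abc, abde, abdf, bd, c.
D = [tuple("abc"), tuple("abde"), tuple("abdf"), tuple("bd"), tuple("c")]
for i, S in enumerate(D):
    print("".join(S), contains_subset_of(D, i + 1, len(D) - 1, S))
```

A tuned implementation would presumably replace the linear scans with range
searches over the sorted data; the sketch only aims to mirror the control flow.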

ALGORITHM 2: Pseudo code for finding the minimal itemsets within the dataset 
D = {D_1, D_2, ..., D_n} by using the lexicographic constraint (Theorem [2.1]). 
A reproduction of the AMS-Lex algorithm described by Bayardo and Panda, but used 
for finding the minimal itemsets rather than the maximal ones, i.e. for finding 
the minimal itemsets within the dataset D we do not mark the subsumed itemsets 
but rather mark an itemset as non-minimal if it is a superset of another one. 

Function Get-Minimal-Itemsets-Lex(D) 

Input: Dataset D = {D_1, D_2, ..., D_n} that is ordered lexicographically and 
    every itemset D_i ∈ D is also ordered lexicographically. 

Output: The minimal itemsets within the dataset D = {D_1, D_2, ..., D_n}. 

    bool is_min[n] ← {true, true, ..., true}; 
    /* Find itemsets subsumed by proper prefix. */ 
    S ← D[1]; 
    for i = 2 to n do 
        if |S| < |D[i]| & D[i][1 : |S|] = S then 
            /* S is a proper prefix of D[i]. */ 
            is_min[i] ← false; 
        else 
            S ← D[i]; 
        end 
    end 
    /* Find itemsets subsumed by non-proper prefix. */ 
    for i = 1 to n − 1 do 
        if is_min[i] & Contains-Subset-Of(D[i + 1 : n], D[i], 1, 0) 
                /* see Algorithm [1] */ then 
            is_min[i] ← false; 
        end 
    end 
    return {D_i ∈ D | is_min[i] = true}; 
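
The two passes of the listing above can be sketched in Python as follows. The
sketch is illustrative only: it uses 0-based indices, and the subset test is
passed in as a parameter whose default, `has_proper_subset_naive`, is a
hypothetical linear scan standing in for Contains-Subset-Of rather than the
range-based search of Algorithm 1.

```python
def has_proper_subset_naive(D, b, e, S):
    """Stand-in for Contains-Subset-Of: linear scan over D[b..e]."""
    s = set(S)
    return any(set(D[k]) < s for k in range(b, e + 1))


def get_minimal_itemsets_lex(D, contains_subset_of=has_proper_subset_naive):
    """Return the minimal itemsets of a lexicographically sorted list D."""
    n = len(D)
    if n == 0:
        return []
    is_min = [True] * n
    # Pass 1: mark itemsets that have the current itemset S as a proper prefix.
    S = D[0]
    for i in range(1, n):
        if len(S) < len(D[i]) and D[i][:len(S)] == S:
            is_min[i] = False
        else:
            S = D[i]
    # Pass 2: any remaining proper subset of D[i] sorts after D[i],
    # so only D[i+1 .. n-1] has to be searched.
    for i in range(n - 1):
        if is_min[i] and contains_subset_of(D, i + 1, n - 1, D[i]):
            is_min[i] = False
    return [D[i] for i in range(n) if is_min[i]]


D = [tuple("abc"), tuple("abde"), tuple("abdf"), tuple("bd"), tuple("c")]
print(get_minimal_itemsets_lex(D))  # [('b', 'd'), ('c',)]
```

Passing the subset test as a parameter keeps the driver independent of how the
search inside D[i + 1 : n] is carried out.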

[Figure 1 image omitted. Panels: (a) call graph of Contains-Subset-Of for the 
itemset D_1 = abc over the dataset D; (b) call graph of Contains-Subset-Of for 
the itemset D_2 = abde over the dataset D; (c) call graph of Contains-Subset-Of 
for the itemset D_3 = abdf over the dataset D; (d) call graph of 
Contains-Subset-Of for the itemset D_4 = bd over the dataset D. Each node is 
labelled with its arguments (b, e, j, d) and the values m and t.] 

Figure 1: The call graphs of the AMS-Lex [3] algorithm for the function 
Contains-Subset-Of over the dataset D = {D_1 = abc, D_2 = abde, D_3 = abdf, 
D_4 = bd, D_5 = c}. All graph nodes (n_i) and edges (e_j) are labelled in the 
order of execution: first n_1, then n_2, then n_3, etc. The AMS-Lex algorithm 
is presented in Algorithm [2] and Contains-Subset-Of in Algorithm [1]. Further 
explanation of these call graphs can be found in Section [2.1.1]. 

ALGORITHM 3: Pseudo code for finding if the dataset D = {D_1, D_2, ..., D_n} 
contains a proper subset of D[i] by using memoization: the call graph node 
v ∈ G(S) and the common prefix between D_i and S. 

Function Contains-Subset-Of-Memoized(D, i, p, v) 

Input: The dataset D = {D_1, D_2, ..., D_n}. The parameter i specifies that we 
    are trying to find if a proper subset of the itemset D[i] exists within 
    D[i + 1 : n]. The input also contains a call graph node v ∈ G(S) for some 
    itemset S and the same dataset D, and the integer p, the size of the 
    longest common prefix between S and D[i]. 

Output: Returns true iff there exists a proper subset of D[i] within 
    D[i + 1 : n], and false otherwise. 

    if v.m > p then 
        /* The maximum index that was accessed by the method 
           Contains-Subset-Of in the memoized iteration represented by v is 
           larger than the size of the common prefix, so we must invoke the 
           function to find the non-proper subsets of D[i] as no more memoized 
           results can be used. */ 
        b ← max(v.b, i + 1); 
        if b ≤ v.e then 
            /* We assume a modified version of the function 
               Contains-Subset-Of which returns a pair consisting of a boolean 
               variable and a node representing the call stack of the 
               function. */ 
            (res, v) ← Contains-Subset-Of(D[b : v.e], D[i], v.j, v.d); 
            return res; 
        end 
        v ← null; 
        return false; 
    end 
    if v.c_1 ≠ null then 
        if Contains-Subset-Of-Memoized(D, i, p, v.c_1) then 
            res ← true; 
        end 
    end 
    if v.c_2 ≠ null then 
        if Contains-Subset-Of-Memoized(D, i, p, v.c_2) then 
            res ← true; 
        end 
    end 
    /* Recall that v.t equals true iff a subset was found in the execution of 
       the function Contains-Subset-Of without considering the recursive 
       calls; i.e. there exists a non-proper subset of D[i] of size smaller 
       than the length of the common prefix p between S and D[i]. */ 
    return v.t; 

ALGORITHM 4: Pseudo code for finding the minimal itemsets within the dataset 
D = {D_1, D_2, ..., D_n} by using memoization and the lexicographic constraint 
(Theorem [2.1]). 

Function Get-Minimal-Itemsets-Lex-Memoized(D) 

Input: Dataset D = {D_1, D_2, ..., D_n} that is ordered lexicographically and 
    every itemset I ∈ D is also ordered lexicographically. 

Output: The minimal itemsets within the dataset D. 

    bool is_min[n] ← {true, true, ..., true}; 
    /* Find itemsets subsumed by proper prefix. */ 
    S ← D[1]; 
    for i = 2 to n do 
        if |S| < |D[i]| & D[i][1 : |S|] = S then 
            /* S is a proper prefix of D[i]. */ 
            is_min[i] ← false; 
        else 
            S ← D[i]; 
        end 
    end 
    /* Find itemsets subsumed by non-proper prefix. */ 
    S ← null; 
    v ← null; 
    for i = 1 to n − 1 do 
        if is_min[i] then 
            if v = null then 
                /* Defined in Algorithm [1] but assuming that it returns a 
                   pair of a boolean value res and the call stack represented 
                   by v. */ 
                (res, v) ← Contains-Subset-Of(D[i + 1 : n], D[i], 1, 0); 
                if res then 
                    is_min[i] ← false; 
                end 
            else 
                /* Largest common prefix of S and D[i]. */ 
                p ← max({1 ≤ j ≤ min(|D[i]|, |S|) | D[i][1 : j] = S[1 : j]}); 
                /* Note that the function Contains-Subset-Of-Memoized 
                   modifies the node v. */ 
                if Contains-Subset-Of-Memoized(D, i, p, v) then 
                    is_min[i] ← false; 
                end 
            end 
            S ← D[i]; 
        end 
    end 
    return {D_i ∈ D | is_min[i] = true}; 
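
The quantity p used in the listing above, the length of the longest common
prefix of the previously processed itemset S and the current itemset D[i], can
be computed with a short helper. The name `common_prefix_len` is hypothetical,
and the sketch returns 0 when the two itemsets share no prefix, a case the set
expression in the listing leaves undefined.

```python
def common_prefix_len(a, b):
    """Length of the longest common prefix of two sequences."""
    p = 0
    for x, y in zip(a, b):
        if x != y:
            break
        p += 1
    return p


# Successive itemsets of the dataset in Figure 1: abc, abde, abdf, bd, c.
D = [tuple("abc"), tuple("abde"), tuple("abdf"), tuple("bd"), tuple("c")]
for prev, cur in zip(D, D[1:]):
    print("".join(prev), "".join(cur), common_prefix_len(prev, cur))
```

The longer these prefixes are for successive itemsets, the more of the
previous call graph can in principle be reused by the memoized algorithm.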

[Figure 2 image omitted. Panels: (a) the memoized call graph v after processing 
the itemset D_1 = abc over the dataset D; (b) the memoized call graph v after 
processing the itemset D_2 = abde over the dataset D; (c) the memoized call 
graph v after processing the itemset D_3 = abdf over the dataset D; (d) the 
memoized call graph v after processing the itemset D_4 = bd over the dataset 
D.] 

Figure 2: This figure presents the evaluation of the memoized version of 
AMS-Lex over the same dataset D as presented in Figure [1]. Here we show exactly 
which parts of the graph are memoized: the shaded nodes. Each sub-figure shows 
the memoized call graph v_k as per Algorithm [4] after processing the itemset 
D_k from the dataset D = {D_1 = abc, D_2 = abde, D_3 = abdf, D_4 = bd, 
D_5 = c}. All graph nodes and edges are labelled in the order of execution: 
first n_1, then n_2, then n_3, etc. The solid nodes in the graphs are evaluated 
using Algorithm [1] and the shaded nodes are memoized. Further explanation of 
these call graphs can be found in Section [3.2.1]. 

ALGORITHM 5: Pseudo code for finding the minimal itemsets M of the input 
dataset D = {D_1, D_2, ..., D_n} using P threads. We present a subroutine 
Get-Minimal-Itemsets-Lex-Parallel which identifies the minimal itemsets of D 
using P parallel threads. It is important to note that in the Thread-Functor 
subroutine the variables index and is_min are passed by reference, meaning that 
they are shared between threads. 

Input: Dataset D = {D_1, D_2, ..., D_n} and the degree of parallelism P. 

Output: The minimal itemsets within the dataset D, i.e. Min(D). 

Function Get-Minimal-Itemsets-Lex-Parallel(dataset D, integer P) 
    atomic<bool> is_min[n] ← {true, true, ..., true}; 
    /* Atomic boolean variables. */ 
    /* Find itemsets subsumed by proper prefix. */ 
    S ← D[1]; 
    for i = 2 to n do 
        if |S| < |D[i]| & D[i][1 : |S|] = S then 
            /* S is a proper prefix of D[i]. */ 
            is_min[i] ← false; 
        else 
            S ← D[i]; 
        end 
    end 
    /* Find itemsets subsumed by non-proper prefix using P parallel 
       threads. */ 
    atomic<int> index ← 1; 
    /* The index that is to be processed next. */ 
    start P parallel instances of Thread-Functor(D, index, is_min); 
    wait for all P instances to finish working; 
    return {D_i ∈ D | is_min[i] = true}; 

Function Thread-Functor(dataset D, atomic<integer> index, atomic<bool> m[n]) 
    i ← fetch-and-increment(index); 
    /* An atomic operation. */ 
    while i < n do 
        /* It is safe to invoke the function Contains-Subset-Of from multiple 
           threads at the same time as it requires only read-only access to 
           the dataset D. */ 
        if Contains-Subset-Of(D[i + 1 : n], D[i], 1, 0) 
                /* as per Algorithm [1] */ then 
            /* Mark the i-th itemset as non-minimal because the dataset D 
               contains a proper subset of the itemset D[i]. */ 
            m[i] ← false; 
            /* Atomically setting the i-th boolean value. */ 
        end 
        i ← fetch-and-increment(index); 
    end 
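
The work-sharing scheme of the listing above, in which each thread repeatedly
claims the next unprocessed index from a shared counter, can be sketched in
Python. Everything here is an assumption made for illustration: the counter is
protected by a `threading.Lock` instead of a hardware atomic, the subset test
is a naive linear scan rather than Contains-Subset-Of, and indices are 0-based.
Because of CPython's global interpreter lock the sketch shows the scheduling
pattern only and is not expected to reproduce the measured speedups.

```python
import threading


def get_minimal_itemsets_parallel(D, P=4):
    """Minimal itemsets of a lexicographically sorted list D, using P threads."""
    n = len(D)
    is_min = [True] * n
    # Sequential pass: itemsets subsumed by a proper prefix.
    S = D[0] if n else None
    for i in range(1, n):
        if len(S) < len(D[i]) and D[i][:len(S)] == S:
            is_min[i] = False
        else:
            S = D[i]

    next_index = 0
    lock = threading.Lock()

    def fetch_and_increment():
        # Lock-based stand-in for an atomic fetch-and-increment.
        nonlocal next_index
        with lock:
            i = next_index
            next_index += 1
            return i

    def worker():
        i = fetch_and_increment()
        while i < n - 1:
            s = set(D[i])
            # Naive stand-in for Contains-Subset-Of(D[i+1 : n], D[i], 1, 0);
            # it only reads D, so concurrent calls do not interfere.
            if any(set(D[k]) < s for k in range(i + 1, n)):
                is_min[i] = False  # each index is written by one thread only
            i = fetch_and_increment()

    threads = [threading.Thread(target=worker) for _ in range(P)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [D[i] for i in range(n) if is_min[i]]


D = [tuple("abc"), tuple("abde"), tuple("abdf"), tuple("bd"), tuple("c")]
print(get_minimal_itemsets_parallel(D, P=3))  # [('b', 'd'), ('c',)]
```

Claiming indices one at a time balances the load dynamically, at the cost of
one synchronised counter update per itemset.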

[Figure 3 image omitted. Panels: Real World: DBLP Input Dataset; Real World: 
Pubmed Input Dataset; Real World Dataset: 9-input Sorting Networks of Depth 4. 
Horizontal axis: Algorithm (AMS-Lex, Memoized, P=4, P=8, P=16, P=32).] 

Figure 3: Experimental results using real world datasets, comparing AMS-Lex 
with the memoized (section [3]) and parallel (section [4]) approaches for 
finding the minimal itemsets within a dataset. For these results we have used a 
machine with 32 physical cores and parallelism factors P = 4, 8, 16 and 32 for 
our parallel modification of AMS-Lex. 

[Figure 4 image omitted. Panels: Synthetic Data for n = 100,000, n = 500,000 
and n = 1,000,000 for Memoized Over AMS-Lex. Horizontal axis: Minimal Item 
Frequency (fmin).] 

Figure 4: Experimental results using synthetic data for n = 100 000, 
n = 500 000 and n = 1 000 000, comparing our memoized version of AMS-Lex 
(section [3]) with AMS-Lex for finding the minimal itemsets within a dataset. 
Here d is the cardinality of the domain of the itemsets. These results show the 
minimal item frequency against the resulting execution time speedup of our 
memoized algorithm compared to AMS-Lex. 

[Figure 5 image omitted. Panels: Synthetic Data for n = 100,000, n = 500,000 
and n = 1,000,000 for Memoized Over AMS-Lex.] 

Figure 5: Experimental results using synthetic data for n = 100 000, 
n = 500 000 and n = 1 000 000, comparing our memoized version of AMS-Lex 
(section [3]) with AMS-Lex for finding the minimal itemsets within a dataset. 
Here d is the cardinality of the alphabet. These results show the number of 
minimal itemsets against the resulting execution time speedup of our memoized 
algorithm compared to AMS-Lex. 

[Figure 6 image omitted. Panels: Synthetic Data for n = 1,000,000 for 
Multithreaded (P = 4), (P = 8), (P = 16) and (P = 32) Over AMS-Lex. Horizontal 
axis: Minimal Item Frequency (fmin).] 

Figure 6: Experimental results for synthetic data for n = 1 000 000, comparing 
our parallel version of AMS-Lex with AMS-Lex for finding the minimal itemsets 
within a dataset. Here d is the cardinality of the domain of the itemsets. These 
results show the minimal item frequency described in Section [5] against the 
resulting execution time speedup of our parallel algorithm compared to AMS-Lex. 
For these results we have used a machine with 32 physical cores and parallelism 
factors of 4, 8, 16 and 32 for our parallel modification of AMS-Lex described 
in section [4]. 